A bilingual multi-modal voice corpus for language and speaker recognition (LASR) services
نویسندگان
چکیده
Language and channel variations are two important concerns currently affecting practical automatic language and speaker recognition performance. To address these challenges, a corpus of speech was collected from 100 bilingual speakers in each of three foreign languages (Arabic-English, KoreanEnglish, and Spanish-English). The recordings were made in highly controlled conditions using multiple microphones simultaneously, each with different measured response characteristics. The speakers were asked to perform a set of speaking tasks including conversations, text independent readings, and prescribed text readings. These tasks were performed in English and in each speaker’s native language. The equipment, the recording procedures, and the data formats are presented, along with a preliminary analysis of recorded signal quality.
منابع مشابه
How textbooks (and learners) get it wrong: A corpus study of modal auxiliary verbs
Many elements contribute to the relative difficulty in acquiring specific aspects of English as a foreign language (Goldschneider & DeKeyser, 2001). Modal auxiliary verbs (e.g. could, might), are examples of a structure that is difficult for many learners. Not only are they particularly complex semantically, but especially in the Malaysian context ...
متن کاملUCBN: A new audio-visual broadcast news corpus for multimodal speaker verification studies
The performance of face, voice, and multimodal speaker verification systems in complex and non-controlled scenarios, is typically lower than systems developed in highly controlled environments. With the aim to facilitate the development of robust multi-modal speaker recognition systems, a new multi-modal (audio-visual) Australian broadcast UCBN (University of Canberra Broadcast News) corpus was...
متن کاملLPFAV2: a New Multi-Modal Database for Developing Speech Recognition Systems for an Assistive Technology Application
In this paper we report on the acquisition and content of a new database intended for developing audio-visual speech recognition systems. This database supports a speaker dependent continuous speech recognition task, based on a small vocabulary, and was captured in the European Portuguese language. Along with the collected multi-modal speech materials, the respective orthographic transcription ...
متن کاملA New Multi-modal Database for Developing Speech Recognition Systems for an Assistive Technology Application
In this paper we report on the acquisition and content of a new database intended for developing audio-visual speech recognition systems. This database supports a speaker dependent continuous speech recognition task, based on a small vocabulary, and was captured in the European Portuguese language. Along with the collected multi-modal speech materials, the respective orthographic transcription ...
متن کاملThe MMSR bilingual and crosschannel corpora for speaker recognition research and evaluation
We describe efforts to create corpora to support and evaluate systems that meet the challenge of speaker recognition in the face of both channel and language variation. In addition to addressing ongoing evaluation of speaker recognition systems, these corpora are aimed at the bilingual and crosschannel dimensions. We report on specific data collection efforts at the Linguistic Data Consortium, ...
متن کامل